added 3 more samples #5

Merged
khelanmodi merged 18 commits into main from more-samples on Mar 19, 2026

Conversation

@khelanmodi
Collaborator

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

Adds three new Python-based DocumentDB OSS sample applications to the samples gallery and registers them in registry.yml.

Changes:

  • Register 3 new samples in registry.yml (fraud detection multi-agent CLI, content semantic search Flask portal, clinical note similarity Flask portal).
  • Add fraud-detection-agent-py sample (data seeding, vector search retrieval agent, LLM analysis/decision agents).
  • Add content-semantic-search-py and clinical-note-similarity-py Flask samples (ingestion/upload scripts, search + detail pages, styling, sample datasets).

Reviewed changes

Copilot reviewed 39 out of 43 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
registry.yml Adds three new sample entries to the gallery registry.
fraud-detection-agent-py/utils/embeddings.py Ollama embeddings helper for the fraud detection sample.
fraud-detection-agent-py/utils/db.py Mongo/DocumentDB client + collection helpers for fraud detection sample.
fraud-detection-agent-py/utils/__init__.py Package marker for fraud sample utils.
fraud-detection-agent-py/upload_data.py Seeds transactions, generates embeddings, creates vector index.
fraud-detection-agent-py/requirements.txt Python dependencies for fraud sample.
fraud-detection-agent-py/README.md Documentation for running the fraud multi-agent pipeline.
fraud-detection-agent-py/main.py CLI entry point to run sample transactions through agents.
fraud-detection-agent-py/data/transactions.json Sample labeled transaction dataset for vector retrieval.
fraud-detection-agent-py/cleanup.py Drops the fraud sample collection.
fraud-detection-agent-py/agents/retrieval_agent.py Vector-search retrieval agent implementation.
fraud-detection-agent-py/agents/decision_agent.py LLM decision agent implementation.
fraud-detection-agent-py/agents/analysis_agent.py LLM analysis agent implementation.
fraud-detection-agent-py/agents/__init__.py Package marker for fraud sample agents.
fraud-detection-agent-py/.gitignore Ignores env/venv/build artifacts for fraud sample.
fraud-detection-agent-py/.env.example Example env vars for fraud sample.
content-semantic-search-py/utils/embeddings.py Ollama embeddings helper for semantic search sample.
content-semantic-search-py/utils/db.py Mongo/DocumentDB client + collection helpers for semantic search sample.
content-semantic-search-py/utils/__init__.py Package marker for semantic search utils.
content-semantic-search-py/templates/index.html Search UI template for semantic search portal.
content-semantic-search-py/templates/article.html Article detail template for semantic search portal.
content-semantic-search-py/static/style.css Styling for semantic search portal.
content-semantic-search-py/requirements.txt Python dependencies for semantic search sample.
content-semantic-search-py/README.md Documentation for semantic search portal + ingestion options.
content-semantic-search-py/ingest.py Ingestion script for sample/custom text/PDF documents + vector index.
content-semantic-search-py/data/articles.json Sample content dataset for semantic search.
content-semantic-search-py/app.py Flask app implementing semantic search + detail routes.
content-semantic-search-py/.gitignore Ignores env/venv/build artifacts and uploads folder.
content-semantic-search-py/.env.example Example env vars for semantic search sample.
clinical-note-similarity-py/utils/embeddings.py Ollama embeddings helper for clinical notes sample.
clinical-note-similarity-py/utils/db.py Mongo/DocumentDB client + collection helpers for clinical notes sample.
clinical-note-similarity-py/utils/__init__.py Package marker for clinical notes utils.
clinical-note-similarity-py/upload_notes.py Seeds clinical notes, generates embeddings, creates vector index.
clinical-note-similarity-py/templates/note.html Note detail template for clinical notes portal.
clinical-note-similarity-py/templates/index.html Search UI template for clinical notes portal.
clinical-note-similarity-py/static/style.css Styling for clinical notes portal.
clinical-note-similarity-py/requirements.txt Python dependencies for clinical notes sample.
clinical-note-similarity-py/README.md Documentation + disclaimer for clinical notes similarity explorer.
clinical-note-similarity-py/data/clinical_notes.json Fictional clinical note dataset for similarity search.
clinical-note-similarity-py/cleanup.py Drops the clinical notes collection.
clinical-note-similarity-py/app.py Flask app implementing similarity search + note detail routes.
clinical-note-similarity-py/.gitignore Ignores env/venv/build artifacts for clinical notes sample.
clinical-note-similarity-py/.env.example Example env vars for clinical notes sample.

Comment on lines +6 to +8
def get_client() -> MongoClient:
    uri = os.environ["DOCUMENTDB_URI"]
    return MongoClient(uri, tlsAllowInvalidCertificates=True)
Collaborator Author

@copilot apply changes based on this feedback

Comment on lines +107 to +108
print(f"Starting Clinical Note Similarity Explorer on http://localhost:{port}")
app.run(debug=True, port=port)
khelanmodi and others added 8 commits March 19, 2026 16:08
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
khelanmodi and others added 8 commits March 19, 2026 16:21
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds three new Python sample projects to the DocumentDB samples gallery and updates shared documentation to reflect recommended local-dev and vector-search usage patterns.

Changes:

  • Registered 3 new Python samples in registry.yml (fraud detection multi-agent, content semantic search portal, clinical note similarity explorer).
  • Added full sample implementations (CLI + Flask apps), including seed/ingest scripts and sample datasets.
  • Expanded SKILL.md with Docker Compose guidance, safer Python client patterns, and vector search tuning notes.

Reviewed changes

Copilot reviewed 40 out of 44 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
SKILL.md Updates the main skill doc with Docker Compose guidance, safer Python connection pattern, and vector search tuning sections.
registry.yml Adds 3 new sample entries to the gallery registry.
fraud-detection-agent-py/utils/embeddings.py Ollama embedding helper for the fraud detection sample.
fraud-detection-agent-py/utils/db.py DocumentDB (PyMongo) client/collection helpers for the fraud detection sample.
fraud-detection-agent-py/utils/__init__.py Package marker for fraud sample utilities.
fraud-detection-agent-py/upload_data.py Seeds fraud sample data, generates embeddings, and creates vector index.
fraud-detection-agent-py/requirements.txt Python dependencies for the fraud detection sample.
fraud-detection-agent-py/README.md Setup and usage documentation for the fraud detection sample.
fraud-detection-agent-py/main.py Runs the fraud multi-agent pipeline over example transactions.
fraud-detection-agent-py/data/transactions.json Sample transaction dataset with fraud labels for the fraud sample.
fraud-detection-agent-py/cleanup.py Convenience script to drop the fraud sample collection.
fraud-detection-agent-py/agents/retrieval_agent.py Retrieval agent implementation using DocumentDB vector search.
fraud-detection-agent-py/agents/decision_agent.py Decision agent implementation using Ollama chat endpoint.
fraud-detection-agent-py/agents/analysis_agent.py Analysis agent implementation using Ollama chat endpoint.
fraud-detection-agent-py/agents/__init__.py Package marker for fraud sample agents.
fraud-detection-agent-py/.gitignore Ignores env/venv/build artifacts for the fraud sample.
fraud-detection-agent-py/.env.example Example environment configuration for the fraud sample.
content-semantic-search-py/utils/embeddings.py Ollama embedding helper for the content semantic search sample.
content-semantic-search-py/utils/db.py DocumentDB (PyMongo) client/collection helpers with safer TLS flag handling.
content-semantic-search-py/utils/__init__.py Package marker for content sample utilities.
content-semantic-search-py/templates/index.html Search UI template for the content semantic search portal.
content-semantic-search-py/templates/article.html Article detail UI template for the content semantic search portal.
content-semantic-search-py/static/style.css Styling for the content semantic search portal UI.
content-semantic-search-py/requirements.txt Python dependencies for the content semantic search sample.
content-semantic-search-py/README.md Setup and usage documentation for the content semantic search sample.
content-semantic-search-py/ingest.py Ingests sample/custom text/PDF content, embeds, and creates vector index.
content-semantic-search-py/data/articles.json Sample articles dataset for content semantic search.
content-semantic-search-py/app.py Flask app implementing semantic search and article detail endpoints.
content-semantic-search-py/.gitignore Ignores env/venv/build artifacts (and uploads dir) for the content sample.
content-semantic-search-py/.env.example Example environment configuration for the content semantic search sample.
clinical-note-similarity-py/utils/embeddings.py Ollama embedding helper for the clinical note similarity sample.
clinical-note-similarity-py/utils/db.py DocumentDB (PyMongo) client/collection helpers with safer TLS flag handling.
clinical-note-similarity-py/utils/__init__.py Package marker for clinical sample utilities.
clinical-note-similarity-py/upload_notes.py Seeds clinical notes, generates embeddings, and creates vector index.
clinical-note-similarity-py/templates/note.html Full-note UI template for the clinical similarity explorer.
clinical-note-similarity-py/templates/index.html Search UI template for the clinical similarity explorer.
clinical-note-similarity-py/static/style.css Styling for the clinical similarity explorer UI.
clinical-note-similarity-py/requirements.txt Python dependencies for the clinical note similarity sample.
clinical-note-similarity-py/README.md Setup and usage documentation for the clinical note similarity sample.
clinical-note-similarity-py/data/clinical_notes.json Fictional, de-identified clinical notes dataset for the clinical sample.
clinical-note-similarity-py/cleanup.py Convenience script to drop the clinical notes collection.
clinical-note-similarity-py/app.py Flask app implementing similarity search and note detail endpoints.
clinical-note-similarity-py/.gitignore Ignores env/venv/build artifacts for the clinical sample.
clinical-note-similarity-py/.env.example Example environment configuration for the clinical note similarity sample.

Comment on lines +6 to +9
def get_client() -> MongoClient:
    uri = os.environ["DOCUMENTDB_URI"]
    return MongoClient(uri, tlsAllowInvalidCertificates=True)

Copilot AI Mar 19, 2026

get_client() will raise a raw KeyError if DOCUMENTDB_URI is missing (via os.environ[...]), and it unconditionally sets tlsAllowInvalidCertificates=True. For samples, prefer os.getenv with a clear exit message when the URI is missing, and only enable invalid certs when an explicit env flag is set (or when connecting to the local dev container) to avoid hard-coding insecure defaults.
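
One way the suggestion above could look is sketched below. This is only an illustration, not the change that was merged: the flag name `DOCUMENTDB_TLS_ALLOW_INVALID_CERTS` and the helper `client_settings` are hypothetical names introduced here, and PyMongo is imported lazily so the settings logic can be checked on its own.

```python
import os
import sys
from typing import Any, Mapping

# Hypothetical flag name chosen for this sketch; the sample may use a different one.
INSECURE_TLS_FLAG = "DOCUMENTDB_TLS_ALLOW_INVALID_CERTS"


def client_settings(env: Mapping[str, str] = os.environ) -> tuple[str, dict[str, Any]]:
    """Return (uri, MongoClient keyword arguments) derived from the environment."""
    uri = env.get("DOCUMENTDB_URI", "").strip()
    if not uri:
        # Exit with a readable message instead of a raw KeyError traceback.
        sys.exit("DOCUMENTDB_URI is not set; see .env.example for the expected variables.")

    kwargs: dict[str, Any] = {}
    # Certificate validation stays on unless the flag is explicitly truthy.
    if env.get(INSECURE_TLS_FLAG, "").strip().lower() in {"1", "true", "yes"}:
        kwargs["tlsAllowInvalidCertificates"] = True
    return uri, kwargs


def get_client():
    # Imported here so client_settings() can be exercised without PyMongo installed.
    from pymongo import MongoClient

    uri, kwargs = client_settings()
    return MongoClient(uri, **kwargs)
```

With this shape, a local dev container would opt in by setting the flag in its `.env`, while any other deployment keeps certificate validation by default.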

Comment on lines +80 to +84
query = request.form.get("query", "").strip()
specialty = request.form.get("specialty", "all")
num_results = int(request.form.get("num_results", 5))
specialties = get_specialties()

Copilot AI Mar 19, 2026

num_results = int(request.form.get('num_results', 5)) can raise ValueError if the request is tampered with (or the field is missing/empty), causing a 500. Consider parsing num_results with a try/except and clamping to a small set of allowed values (or at least >= 1) similar to the handling in content-semantic-search-py/app.py.
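
A minimal sketch of the parsing described above, assuming a small allow-list of result counts. The helper name `parse_num_results` and the values `(3, 5, 10)` are hypothetical, and this is not taken from `content-semantic-search-py/app.py`.

```python
# Hypothetical allow-list and default; the real form may offer other choices.
ALLOWED_NUM_RESULTS = (3, 5, 10)
DEFAULT_NUM_RESULTS = 5


def parse_num_results(raw, default=DEFAULT_NUM_RESULTS, allowed=ALLOWED_NUM_RESULTS):
    """Convert a form value to an int, falling back to the default on bad input."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing field (None), empty string, or non-numeric text.
        return default
    # Reject numbers the UI never offers, such as 0, negatives, or huge values.
    return value if value in allowed else default


# In the Flask view this would replace the bare int(...) call, for example:
# num_results = parse_num_results(request.form.get("num_results"))
```

Falling back to the default instead of raising keeps a tampered or empty field from turning into a 500 response.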

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@khelanmodi khelanmodi merged commit f0e344d into main Mar 19, 2026
@khelanmodi khelanmodi deleted the more-samples branch March 19, 2026 23:30